An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols

Authors

Chaitanya Kulkarni

Wei Xu

Alan Ritter

Raghu Machiraju

Abstract

We describe an effort to annotate a corpus of natural language instructions consisting of 662 wet lab protocols to facilitate automatic or semi-automatic conversion of protocols into a machine-readable format and benefit biological research. Experimental results demonstrate the utility of our corpus for developing machine learning approaches to shallow semantic parsing of instructional texts. We will make our annotated corpus available to the research community upon publication.

Similar resources

Corpus based coreference resolution for Farsi text

"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to the same entity. A coreference resolution tool could be...

The A'lam Corpus: A Standard Named-Entity Corpus for the Persian Language

Named entity recognition (NER) is a natural language processing (NLP) problem that is mainly used for text summarization, data mining, data retrieval, question answering, machine translation, and document classification systems. A NER system is tasked with determining the boundaries of each named entity, recognizing its type, and classifying it into predefined categories. The categories of named...

A New Method for Extracting Named Entities in Classical Arabic

In Natural Language Processing (NLP) studies, developing resources and tools contributes to the breadth and effectiveness of research in each language. In recent years, Arabic Named Entity Recognition (ANER) has attracted the attention of NLP researchers because of its significant impact on improving other NLP tasks such as machine translation, information retrieval, question answering, query result...

An Aligned French-Chinese corpus of 10K segments from university educational material

This paper describes a corpus of nearly 10K French-Chinese aligned segments, produced by postediting machine translated computer science courseware. This corpus was built from 2013 to 2016 within the MACAU project, by native Chinese students. The quality, as judged by native speakers, is adequate for understanding (far better than by reading only the original French) and for getting better mark...

Reading English in the Computer Lab

The present study compares the performance of two TEFL reading classes: one taking place in a regular classroom and the other held in a computer lab, with the learners practicing reading online. The results of an independent-samples t-test showed that the difference between the learners' scores on their reading comprehension post-tests and pretests did not differ statistically significantly fro...

Publication date: 2018
